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ABSTRACT 

This paper deals with the estimation of one-dimensional 
Gaussian mixture. Given a set of observations of a K- 
component Gaussian mixture, we focus on the estimation 
of the component expectations. The number of components 
is supposed to be known. Our method is based on a spec- 
tral analysis of the estimated first characteristic function. 
We construct a Toeplitz matrix Rjv/ with 2M — 1 estimated 
samples of the first characteristic function and show that the 
mixture component expectations can be derived from the 
eigenvector decomposition of Ra/. Simulations illustrate 
the performance of our algorithm on several configurations 
of a six-component Gaussian mixture. In the investigated 
scenarios the proposed method outperforms the Expectation- 
Maximization algorithm. 
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1. INTRODUCTION 

In this paper we deal with Gaussian mixture estimation. 
Given a set of one-dimensional observations originating from 
K possible Gaussian components, we focus on the estimation 
of the component expectations. The number of components 
is supposed to be known, and the component expectations are 
supposed to be all different. 

One method consists in estimating a sampling of the ob- 
servations probability density function (pdf), a mixture of K 
pdf, by associating a kernel to each observation and adding 
the contribution of all the kernels [[Tj. A search of the pdf 
modes then leads to the component expectations. The draw- 
back of such a method is that it requires the selection of extra- 
parameters (kernel design, sampling intervals). Furthermore, 
the final mode search algorithm might fail because of spurious 
local maxima in the estimated pdf. 

An alternative method consists in using the Expectation- 
Maximization (EM) algorithm [2!|. Each EM iteration con- 
sists of two steps. The Expectation step estimates the proba- 
bility for each observation to come from each mixture com- 
ponent. Then, during the Maximization step, these estimated 
probabilities are used to update the estimation of the mixture 
parameters. This procedure converges to a local maximum 



of the likelihood. The main drawback of the EM algorithm 
is the potential convergence to some local non-global max- 
ima of the likelihood. Some solutions consist, for instance, in 
using smart initializations or stochastic optimization [j3j. 

In this contribution we propose a new approach based on 
a spectral analysis of the first characteristic function (CF). We 
define a Toeplitz matrix R^/ with 271/ — 1 estimated samples 
of the CF and show that the mixture component expectations 
can be estimated from an eigenvector decomposition of Ra/. 
The proposed method is strongly inspired from the ;7iMltiple 
i/gnal classification algorithm MUSIC H) which aims at es- 
timating the frequencies in a sum of sinusoids. The paper is 
organized as follow: In section 2 the observation model is 
presented and an analytical expression of the CF of a Gaus- 
sian mixture is given. In section 3 the matrix Ra/ is defined 
and some properties of Rm are described. Section 4 then 
presents the complete estimation algorithm. Section 5 illus- 
trates the estimation performances on a six-component Gaus- 
sian mixture with different configurations. Conclusions are 
finally given in section 6, as well as perspectives for using the 
proposed method to estimate the number of components in a 
mixture. 



2. GAUSSIAN MIXTURE 

2.1. Probability density function (pdf) 

Let {pk}ke{i - K} be a set of K positive mixing weights that 
sum up to one. The multimodal pdf of the random observable 
variable Z is a finite mixture given by: 
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(1) 
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whereg(z, a^, au) is the Gaussian pdf given by: g{z, Uk, (Jk) 



-exp 



and Ofc and ak are respectively the ex- 



pectation and the standard deviation of component k. Given 
a set of N observed realizations {zn}ne{i.- - ,n} of Z we 
focus on the estimation of the K component expectations 
{ifc}fee{i, -- .K}- Our proposal is mainly based on the esti- 
mated first characteristic function (CF) of the mixture. 



2.2. First characteristic function (CF) 

In general, the CF of a random variable X is defined by: 



(2) 



where Ex{-} is the mathematical expectation with respect to 
the pdf of X. For instance, the CF of a Gaussian random 
variable X with pdf g{x, a, a) is given by IS): 
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Consequently the CF of the random variable Z with the pdf 
described in ([T]i is given by: 

^ />oo 

<^2W = / e'*^5(2;,afe,f7A;)dz (5) 

fc=l Jz = -oo 
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Now let 0m be the sampled version of cf>z (t) with a sampling 
period Tg. According to (|6]l, we have: 
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where Wk and a^.m are defined by: 
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In practical situation, 0,„ can be estimated from a set of N 
observations ^at using: 
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n=l 



In section [3] we will show how the {wk}k=i, --,K defined 
in (|9]l can be estimated from the sampled CF. Once the 
{wk}k=i,- - .K are estimated, the {ak}k=i,- - .k can be ob- 
tained without ambiguity if the sampling period Tg is less 
than 



i — . I — !-: If we for instance choose: 
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2(max{z„} - min{z„}) ' 



(12) 



then ^ ~ 2(max{2;„} — min{2;„}). Since min{z„} < ak < 
max{z„} there is exactly one integer 1^,^. such as: 

angle(u;fc) 2n 

7f l-'u.fc7^ e [min{2;„} max{z„}J, (13) 



and we have: 

angle('u;fc) 
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Furthermore, if Tg verifies ( fT2] i. and since the {ak}k=i,- - .k 
are supposed to be all different, then the {wk}k=i.- - ,k are 
also all different. This will be used in section [3j 



3. DEFINITION AND PROPERTIES OF R 



M 



Let Rjv/ = {rji)j^i=i,... e C'^'^^-' be the Toeplitz matrix 
with the following elements: 

],l^l,---,M (15) 



where has been defined in (|7]). Note that (/)_ 
Ra/ is a Hermitian matrix and one only has to compute M 
samples of the CF to build Ri\/. Including ^ into ( fTsl i: 
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where in ( fTsT i we used that 
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"^k^k ~ "^i- ""^ 



smce if^ ^^A; " = 1- A consequence of ( fTSI ) is that Ra/ 
can be expressed as the sum of a "signal" matrix Sa/ and a 
"perturbation" matrix Pa/: 
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A/ — ^Af + A/, 
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where the "signal" matrix Sm is given by: 



Sa^ = (wi,... ,wk)D I : I ^C^'^'' (20) 
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(21) 

D = diag(pi,... ,pA') GR^'^^ (22) 
and the "perturbation" matrix Pa/ is given by: 

K 



M 



Y^Pk (K,_,-i)u{r^) G 
fc=i '■' 



(23) 

The "signal" matrix S a/ is a well-known matrix in the spec- 
tral analysis community. It is the auto-correlation matrix of a 
received sum of K sinusoids with angular frequencies ak and 



power pfe. High resolution algorithm such as MUSIC ||4] es- 
timate Sm from some (potentially corrupted) signal samples 
then estimate the sinusoid frequencies from its eigenvector 
decomposition. Indeed, since the Wk are all different (sec- 
tion [2]2}, one can show that the rank of Sm is equal to K 
and that the signal vectors defined in i2l[ are orthogo- 
nal to any vector of the kernel of Sm (H). Consequently, if 
V = (vA'+i, • • • , VA/) G C'^>'M-K contains M - K or- 
thogonal eigenvectors belonging to Ker{Sj\/} we have: 

wf VV^Wfc = 0, fc = 1, • • ■ , A". (24) 

A consequence of (|24] | is that if tj denotes the sum of the jf/i 
diagonal of VV^ (j G {-M + 1, • • • , M - 1} and to = 
tracejVV^}) and if q{y) is the polynomial defined by; 

A/-1 

] = -M+l 

then the zeros of q{y) exhibit inverse symmetry with respect 
to the unit circle, and q{y) exactly has K zeros on the unit 
circle, equal to {wk}k=i,- - ,k (see Q for a detailed proof). 

In our Gaussian mixture estimation case, the "signal" ma- 
trix S^/ ( |20l i is corrupted with the "perturbation" matrix P^/ 
(|23] l. When all the component variances tend to zero (ideal 
case) the perturbation matrix Pm tends to a null matrix: us- 
ing ([Tol l and (l23T l we have: 

lim ak,i-j ~ 1, fc = 1, • ■ • , if (26) 
lim Pa/ = Oa/xm- (27) 

(au--- ,crjf)-»(0,--- ,0) 

Yet, in the general case. Pa/ is not null and unfortunately 
depends on Wk- 

4. ESTIMATION ALGORITHM 

The proposed algorithm for estimating the set of component 
expectations is based on the eigenvector decomposition of the 
estimation of Ra/ ( fTST i, thus neglecting the effect of the per- 
turbation matrix Pm (l23] l. Given a set of N observations 
{zn}ne{i. - N} the algorithm steps are the following: 

1 . define a sampling period Tg using (fT2] i 

2. estimate AiQ samples {M > K) of the CF using (fTTT i 

3. build the matrix Ra/ using dlSt 

4. perform a eigenvector decomposition of Ra/ 

5. construct the matrix V = (va'+i, • • • , va/) with the 
M — K eigenvectors associated to the M — K smallest 
eigenvalues of R a/ 

'M = 2K seems to be a good choice from our simulations but more 
investigations are needed to optimize the value of M. 
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Table 1. Means, variances, weights of the simulated mixture 

6. calculate the coefficient of q{y) defined in dZSl ) 

7. calculate the roots of q{y), keep the roots inside the 
unit circle then identify the K roots that are closest to 
the unit circle, call them {wk}k=i,- - ,k 

8. derive {ak}k=i, -- ,k from {wk]k=i,- ,k using ^ 

5. SIMULATION 

In our simulations several types of a six-component Gaussian 
mixture have first been considered. The set of expectations is 
equal to (0, 1, 2, 4, 5, 6), with a difference of one or two be- 
tween two successive component expectations. Four cases 
have been studied: common variance and common weight 
(scenario 1), different variances and common weight (sce- 
nario 2), common variance and different weights (scenario 3) 
and different variances and different weights (scenario 4). A 
summary of the scenarios is given in Table 1 . The parameter a 
in Table 1 enables to simulate different overlapping situation. 
The number N of observations per simulation run is 200. The 
RA/-based algorithm has been run as described in section |4] 
with M = This algorithm has been compared to the 

EM algorithm ||2l with a uniform random start and a maximal 
number of 100 iterations. See [S) for a detailed description of 
the Gaussian mixture estimation with EM. A constrained ver- 
sion of the EM (EMc) which imposes a common variance and 
a common mixing weight has been used to prevent the con- 
vergence to components with an almost null variance. In all 
the scenario, EMc provides better estimates than the standard 
EM, even in the scenario where the component variances or 
the component weights are different. Therefore only the per- 
formances of EMc are presented here. To get rid of the permu- 
tation ambiguity, the estimation performance is evaluated as 
follows: If a G is the vector of the true component expec- 
tations and kr G is the vector of the estimated component 
expectations at simulation run r, the performance criterion 
is defined as the maximal absolute distance between the true 
and estimated ordered vector of component expectations: 

er = ||sort(a) - sort(ar)||^ , 
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Fig. 1. Performances of the constrained EM (EMc, dotted 
lines) and Km based algorithm with M = 2K = 12 (full 
lines) for different values of a. For each value of cr and for 
each scenario 10000 simulation runs have been performed. 
The performance criteria are the probabilities for to be 
smaller than 0.1 (top), and to be smaller than 0.2 (bottom). 

where sort(x) is the ordered permutation of x and |I • is the 
infinity norm in K^^. 

The simulation results are presented in Figure 1 for dif- 
ferent values of a. When a is greater than 0, there is a risk 
that the constrained EM converges to a wrong set of estimated 
component expectations. Typically one estimated component 
expectation is located in the middle of two true component 
expectations. For instance, for cr = 0.1, EM^ provides a good 
set of estimates for only 40% of the run. On the contrary, 
the R7\/-based algorithm provides a perfect set of estimates if 
a < 0.1 and remains less than 0.2 if cr < 0.2. In all the 
investigated scenario and for all the values of cr, the proposed 
method outperforms the EMc algorithm. 

6. CONCLUSION 

Given a set of observations originating from a A'-component 
univariate mixture, we focused on the estimation of the com- 
ponent expectations when the number K of components is 
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Fig. 2. eigenvalue decomposition of the matrix Rm with 
M ~ 10 in scenario 4 (6 mixture components) for cr — 0.15. 

known. We proposed a method based on the eigenvector de- 
composition of a Toeplitz matrix R^/ built from some esti- 
mated samples of the first characteristic function. Simulations 
illustrated the superiority of the proposed method compared 
with the Expectation-Maximization algorithm on various con- 
figurations of a six-component Gaussian mixture. More theo- 
retical investigations are now needed to study the influence of 
the perturbation matrix ( |23] ) on the performances. 

Our current research also deals with the case of an un- 
known number of components. In figure 2 we plot the eigen- 
values of Rm with M = 10 obtained in scenario 4 of Table 
1 (where the mixture components have different weights and 
variances) with N ~ 200 observations and a = 0.15. One 
can see that K ^ 6 eigenvalues are clearly greater than 
while the M — K ^ 4 other eigenvalues are almost null. In 
general, one can therefore expect the eigenvalue decomposi- 
tion of the matrix Rm to provide relevant information on the 
number of components in an observed mixture. 
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